Lecture 9: Convolutional Neural Networks¶

Handling image data

Joaquin Vanschoren, Eindhoven University of Technology

Overview¶

  • Image convolution
  • Convolutional neural networks
  • Data augmentation
  • Model interpretation
  • Using pre-trained networks (transfer learning)

Convolution¶

  • Operation that transforms an image by sliding a smaller image (called a filter or kernel) over it and combining the overlapping pixel values
    • Slide an $n$ x $n$ filter over $n$ x $n$ patches of the original image
    • Every pixel is replaced by the sum of the element-wise products of the values of the image patch around that pixel and the kernel
# kernel and image_patch are n x n matrices
pixel_out = np.sum(kernel * image_patch)
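
The one-liner above computes a single output pixel. As a minimal sketch of the full sliding operation (without padding), assuming a square kernel and a 2D grayscale image, it could be looped over all patches as follows. The helper name `convolve2d` and the toy image are illustrative only. Like deep learning libraries, this does not flip the kernel, so strictly speaking it is a cross-correlation.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding) and return the feature map."""
    n = kernel.shape[0]
    out_h = image.shape[0] - n + 1
    out_w = image.shape[1] - n + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            image_patch = image[i:i + n, j:j + n]
            out[i, j] = np.sum(kernel * image_patch)
    return out

# Toy image with a vertical edge: dark on the left, bright on the right
image = np.zeros((5, 5))
image[:, 3:] = 1.0

# Kernel that responds to dark-to-bright transitions from left to right
kernel = np.array([[-1., 0., 1.],
                   [-1., 0., 1.],
                   [-1., 0., 1.]])

feature_map = convolve2d(image, kernel)
print(feature_map)
```

The feature map is zero where the patch is uniform and large where the patch straddles the edge.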


  • Different kernels can detect different types of patterns in the image

Demonstration on Google streetview data¶

House numbers photographed from Google streetview imagery, cropped and centered around digits, but with neighboring numbers or other edge artifacts.

For recognizing digits, color is not important, so we grayscale the images

Demonstration


Image convolution in practice¶

  • How do we know which filters are best for a given image?
  • Families of kernels (or filter banks) can be run on every image
    • Gabor, Sobel, Haar Wavelets,...
  • Gabor filters: Wave patterns generated by changing:
    • Frequency: narrow or wide undulations
    • Theta: angle (direction) of the wave
    • Sigma: resolution (size of the filter)
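
To make the three parameters concrete, here is a minimal NumPy sketch of a Gabor kernel: a cosine wave multiplied by a Gaussian envelope. It is a simplified variant (real part only, circular envelope, fixed kernel size), not necessarily the implementation behind the demonstrations; the function name and the `size` parameter are assumptions.

```python
import numpy as np

def gabor_kernel(frequency, theta, sigma, size=9):
    """Real part of a simple Gabor filter on a size x size grid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate the coordinate system by theta (direction of the wave)
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    y_rot = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope: sigma controls how far the filter extends
    envelope = np.exp(-(x_rot ** 2 + y_rot ** 2) / (2 * sigma ** 2))
    # Cosine wave: frequency controls how narrow the undulations are
    wave = np.cos(2 * np.pi * frequency * x_rot)
    return envelope * wave

kernel = gabor_kernel(frequency=0.25, theta=0.0, sigma=2.0)
print(kernel.shape)
```

Changing `theta` rotates the stripes, while a larger `frequency` packs more stripes into the same window.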

Demonstration


Demonstration on the streetview data


Filter banks¶

  • Different filters detect different edges, shapes,...
  • Not all seem useful

Another example: Fashion MNIST

Demonstration


Fashion MNIST with multiple filters (filter bank)

Convolutional neural nets¶

  • Finding relationships between individual pixels and the correct class is hard
  • We want to discover 'local' patterns (edges, lines, endpoints)
  • Representing such local patterns as features makes it easier to learn from them
  • We could use convolutions, but how to choose the filters?


Convolutional Neural Networks (ConvNets)¶

  • Instead of manually designing the filters, we can also learn them based on data
    • Choose filter sizes (manually), initialize with small random weights
  • Forward pass: Convolutional layer slides the filter over the input, generates the output
  • Backward pass: Update the filter weights according to the loss gradient
  • Illustration for 1 filter:

ml

Convolutional layers: Feature maps¶

  • One filter is not sufficient to detect all relevant patterns in an image
  • A convolutional layer applies and learns $d$ filters in parallel
  • Slide $d$ filters across the input image (in parallel) -> a (1x1xd) output per patch
  • Reassemble into a feature map with $d$ 'channels', a (width x height x d) tensor.


Border effects (zero padding)¶

  • Consider a 5x5 image and a 3x3 filter: there are only 9 possible locations, hence the output is a 3x3 feature map
  • If we want to maintain the image size, we use zero-padding, adding 0's all around the input tensor.
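
The 5x5 example can be checked with a small NumPy sketch. It pads the image with one ring of zeros (enough for a 3x3 filter) and counts the possible filter positions before and after; the helper `n_positions` is a hypothetical name.

```python
import numpy as np

image = np.arange(25, dtype=float).reshape(5, 5)

# One ring of zeros around the image
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

def n_positions(size, kernel_size):
    """Number of places a filter fits along one dimension (stride 1)."""
    return size - kernel_size + 1

unpadded_out = n_positions(image.shape[0], 3)  # output width without padding
padded_out = n_positions(padded.shape[0], 3)   # output width with zero-padding
print(padded.shape, unpadded_out, padded_out)
```

With padding, the output keeps the original 5x5 resolution.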


Undersampling (striding)¶

  • Sometimes, we want to downsample a high-resolution image
    • Faster processing, less noisy (hence less overfitting)
  • One approach is to skip values during the convolution
    • Distance between 2 windows: stride length
  • Example with stride length 2 (without padding):

ml
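
The effect of stride and padding on the output resolution follows a standard formula, sketched below for one dimension. The function name is an assumption; the formula itself is the usual floor((n + 2p - k) / s) + 1.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output size along one dimension of a convolution."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

sizes = {
    "5x5 input, 3x3 filter": conv_output_size(5, 3),
    "same, stride 2": conv_output_size(5, 3, stride=2),
    "same, zero-padding 1": conv_output_size(5, 3, padding=1),
    "7x7 input, 3x3 filter, stride 2": conv_output_size(7, 3, stride=2),
}
for description, size in sizes.items():
    print(f"{description}: {size}x{size}")
```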

Max-pooling¶

  • Another approach to shrink the input tensors is max-pooling:
    • Run a filter with a fixed stride length over the image
      • Usually 2x2 filters and stride length 2
    • The filter simply returns the max (or avg) of all values
  • Aggressively reduces the resolution of the feature maps, and hence the number of weights in later layers (less overfitting)
  • Information from every input node spreads more quickly to output nodes
    • In pure convnets, one input value spreads to 3x3 nodes of the first layer, 5x5 nodes of the second, etc.
    • Without max-pooling, you need much deeper networks, which are harder to train
  • Increases translation invariance: patterns can affect the predictions no matter where they occur in the image
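
A minimal NumPy sketch of 2x2 max-pooling with stride 2 on a single 2D feature map is given below. It assumes that a trailing odd row or column is simply dropped; the function name is illustrative.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max-pooling with stride 2 on a 2D array."""
    h2, w2 = x.shape[0] // 2, x.shape[1] // 2
    x = x[:h2 * 2, :w2 * 2]            # Drop a trailing odd row/column
    blocks = x.reshape(h2, 2, w2, 2)   # Group pixels into 2x2 blocks
    return blocks.max(axis=(1, 3))     # Keep the largest value per block

x = np.arange(16).reshape(4, 4)
pooled = max_pool_2x2(x)
print(pooled)
```

Each output value summarizes a 2x2 block, so the resolution is halved in both dimensions and the number of values drops by a factor of 4.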

Convolutional nets in practice¶

  • ConvNets usually use multiple convolutional layers to learn patterns at different levels of abstraction
    • Find local patterns first (e.g. edges), then patterns across those patterns
  • Use MaxPooling layers to reduce resolution, increase translation invariance
  • Use sufficient filters in the first layer (otherwise information gets lost)
  • In deeper layers, use increasingly more filters
    • Preserve information about the input as resolution decreases
    • Avoid decreasing the number of activations (resolution x nr of filters)

Example with Keras:

  • Conv2D for 2D convolutional layers
    • 32 filters in the first layer, randomly initialized (Glorot uniform by default)
    • Deeper layers use 64 filters
    • Filter size is 3x3
    • ReLU activation to simplify training of deeper networks
  • MaxPooling2D for max-pooling
    • 2x2 pooling reduces the number of inputs by a factor 4
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', 
                        input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

Observe how the input image of 28x28x1 is transformed to a 3x3x64 feature map

  • Convolutional layer:
    • No zero-padding: every output 2 pixels less in every dimension
    • 320 weights: (3x3 filter weights + 1 bias) * 32 filters
  • After every MaxPooling, resolution halved in every dimension
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_3 (Conv2D)           (None, 26, 26, 32)        320       
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 13, 13, 32)       0         
 2D)                                                             
                                                                 
 conv2d_4 (Conv2D)           (None, 11, 11, 64)        18496     
                                                                 
 max_pooling2d_3 (MaxPooling  (None, 5, 5, 64)         0         
 2D)                                                             
                                                                 
 conv2d_5 (Conv2D)           (None, 3, 3, 64)          36928     
                                                                 
=================================================================
Total params: 55,744
Trainable params: 55,744
Non-trainable params: 0
_________________________________________________________________
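
The shapes and parameter counts in this summary can be reproduced by hand. The sketch below assumes 3x3 filters without padding and 2x2 pooling, as in the model above; the helper names are illustrative.

```python
def conv_layer(size, channels_in, filters, kernel=3):
    """Output size, output channels, and weight count of an unpadded conv layer."""
    out_size = size - kernel + 1
    params = (kernel * kernel * channels_in + 1) * filters  # +1 for the bias
    return out_size, filters, params

def max_pool(size, pool=2):
    """Output size after pooling (a trailing odd row/column is dropped)."""
    return size // pool

size, channels = 28, 1
total_params = 0
trace = []
# (number of filters, whether a MaxPooling2D layer follows)
for filters, pool_after in [(32, True), (64, True), (64, False)]:
    size, channels, params = conv_layer(size, channels, filters)
    total_params += params
    trace.append((size, size, channels, params))
    if pool_after:
        size = max_pool(size)
        trace.append((size, size, channels, 0))

for row in trace:
    print(row)
print("Total params:", total_params)
```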

Completing the network

  • To classify the images, we still need a Dense and Softmax layer.
  • We need to flatten the 3x3x64 feature map to a vector of size 576
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

Complete network

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_3 (Conv2D)           (None, 26, 26, 32)        320       
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 13, 13, 32)       0         
 2D)                                                             
                                                                 
 conv2d_4 (Conv2D)           (None, 11, 11, 64)        18496     
                                                                 
 max_pooling2d_3 (MaxPooling  (None, 5, 5, 64)         0         
 2D)                                                             
                                                                 
 conv2d_5 (Conv2D)           (None, 3, 3, 64)          36928     
                                                                 
 flatten (Flatten)           (None, 576)               0         
                                                                 
 dense (Dense)               (None, 64)                36928     
                                                                 
 dense_1 (Dense)             (None, 10)                650       
                                                                 
=================================================================
Total params: 93,322
Trainable params: 93,322
Non-trainable params: 0
_________________________________________________________________
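
The dense layers account for the remaining parameters: every unit has one weight per input plus a bias. A short check, with a hypothetical helper name:

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

flattened = 3 * 3 * 64                 # Size of the flattened feature map
hidden = dense_params(flattened, 64)   # Dense(64)
output = dense_params(64, 10)          # Dense(10) with softmax
conv_params = 320 + 18496 + 36928      # From the convolutional layers
total = conv_params + hidden + output
print(flattened, hidden, output, total)
```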

Run the model on MNIST dataset

  • Train and test as usual: 99% accuracy
    • Compared to 97.8% accuracy with the dense architecture
Accuracy:  0.9915

Tip:

  • Training ConvNets can take a lot of time
  • Save the trained model (and history) to disk so that you can reload it later
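import os
import pickle
# model_dir is assumed to be an existing directory; history is the object returned by model.fit
model.save(os.path.join(model_dir, 'mnist.h5'))
with open(os.path.join(model_dir, 'mnist_history.p'), 'wb') as file_pi:
    pickle.dump(history.history, file_pi)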
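
Reloading is the mirror image of saving. The sketch below round-trips only the history dictionary, using a temporary directory and made-up values as stand-ins so that it runs without a trained model. The model itself would typically be restored with `keras.models.load_model`, shown here as a comment.

```python
import os
import pickle
import tempfile

# Stand-in for history.history (made-up values, for illustration only)
history_dict = {"loss": [0.31, 0.12], "accuracy": [0.91, 0.97]}

model_dir = tempfile.mkdtemp()  # Hypothetical model directory
history_path = os.path.join(model_dir, "mnist_history.p")

# Save
with open(history_path, "wb") as file_pi:
    pickle.dump(history_dict, file_pi)

# Reload later
with open(history_path, "rb") as file_pi:
    restored = pickle.load(file_pi)

# The trained model would be reloaded along these lines:
# model = keras.models.load_model(os.path.join(model_dir, "mnist.h5"))
print(restored)
```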

Cats vs Dogs¶

  • A more realistic dataset: Cats vs Dogs
    • Colored JPEG images, different sizes
    • Not nicely centered, translation invariance is important
  • Preprocessing
    • Create balanced subsample of 4000 colored images
      • 2000 for training, 1000 validation, 1000 testing
    • Decode JPEG images to floating-point tensors
    • Rescale pixel values to [0,1]
    • Resize images to 150x150 pixels

Data generators¶

  • ImageDataGenerator: decodes, resizes, and rescales JPEG images
  • Returns a Python generator we can endlessly query for batches of images
  • Separately for training, validation, and test set
train_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
        train_dir,              # Directory with images
        target_size=(150, 150), # Resize images 
        batch_size=20,          # Return 20 images at a time
        class_mode='binary')    # Binary labels
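
To illustrate what "a generator we can endlessly query" means, here is a simplified NumPy stand-in. It is not the Keras implementation: it works on in-memory arrays rather than a directory of JPEGs, and the function name and toy data are assumptions.

```python
import numpy as np

def batch_generator(images, labels, batch_size=20):
    """Yield (images, labels) batches forever, rescaling pixels to [0, 1]."""
    n = len(images)
    while True:                              # Never raises StopIteration
        order = np.random.permutation(n)     # Reshuffle every pass over the data
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            yield images[idx].astype("float32") / 255.0, labels[idx]

# Toy data: 50 random 8x8 RGB images with binary labels
images = np.random.randint(0, 256, size=(50, 8, 8, 3), dtype=np.uint8)
labels = np.arange(50) % 2

gen = batch_generator(images, labels, batch_size=20)
batches = [next(gen) for _ in range(7)]      # More batches than one pass contains
print([len(x) for x, y in batches])
```

Because the generator never stops by itself, training needs to be told how many batches make up one epoch, which is the role of `steps_per_epoch`.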

Since the images are larger and more complex, we add another convolutional layer and increase the number of filters to 128.

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_6 (Conv2D)           (None, 148, 148, 32)      896       
                                                                 
 max_pooling2d_4 (MaxPooling  (None, 74, 74, 32)       0         
 2D)                                                             
                                                                 
 conv2d_7 (Conv2D)           (None, 72, 72, 64)        18496     
                                                                 
 max_pooling2d_5 (MaxPooling  (None, 36, 36, 64)       0         
 2D)                                                             
                                                                 
 conv2d_8 (Conv2D)           (None, 34, 34, 128)       73856     
                                                                 
 max_pooling2d_6 (MaxPooling  (None, 17, 17, 128)      0         
 2D)                                                             
                                                                 
 conv2d_9 (Conv2D)           (None, 15, 15, 128)       147584    
                                                                 
 max_pooling2d_7 (MaxPooling  (None, 7, 7, 128)        0         
 2D)                                                             
                                                                 
 flatten_1 (Flatten)         (None, 6272)              0         
                                                                 
 dense_2 (Dense)             (None, 512)               3211776   
                                                                 
 dense_3 (Dense)             (None, 1)                 513       
                                                                 
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
_________________________________________________________________

Training¶

  • The fit function also supports generators
    • 100 steps per epoch (batch size: 20 images per step), for 30 epochs
    • Provide a separate generator for the validation data
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-4),  # 'lr' is deprecated
              metrics=['acc'])
history = model.fit(
      train_generator, steps_per_epoch=100,
      epochs=30, verbose=0,
      validation_data=validation_generator,
      validation_steps=50)

Results¶

  • The network seems to be overfitting. Validation accuracy is stuck at 75% while the training accuracy reaches 100%
  • There are many things we can do:
    • Regularization (e.g. Dropout, L1/L2, Batch Normalization,...)
    • Generating more training data
    • Meta-learning: Use pretrained rather than randomly initialized filters

Data augmentation¶

  • Generate new images via image transformations
    • Images will be randomly transformed every epoch
  • We can again use a data generator to do this
datagen = ImageDataGenerator(
      rotation_range=40,     # Rotate image up to 40 degrees
      width_shift_range=0.2, # Shift image left-right up to 20% of image width
      height_shift_range=0.2,# Shift image up-down up to 20% of image height
      shear_range=0.2,       # Shear (slant) the image up to 0.2 degrees
      zoom_range=0.2,        # Zoom in or out up to 20%
      horizontal_flip=True,  # Horizontally flip the image
      fill_mode='nearest')
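
As a rough illustration of what such transformations do, the sketch below applies a random horizontal flip and a random left-right shift to one image, filling the exposed border with the nearest pixel. It is a simplified stand-in for two of the options above, not the Keras implementation; all names are assumptions.

```python
import numpy as np

def random_flip_and_shift(img, rng, max_shift=0.2):
    """Randomly flip an (h, w, c) image and shift it horizontally."""
    h, w = img.shape[:2]
    if rng.random() < 0.5:
        img = img[:, ::-1]                       # Horizontal flip
    limit = int(max_shift * w)
    shift = int(rng.integers(-limit, limit + 1)) # Shift in pixels
    # Clipping the column index repeats the edge pixel ('nearest' fill)
    cols = np.clip(np.arange(w) - shift, 0, w - 1)
    return img[:, cols]

rng = np.random.default_rng(0)
img = np.arange(10 * 10 * 3).reshape(10, 10, 3)
augmented = random_flip_and_shift(img, rng)
print(augmented.shape)
```

Calling the function again on the same image gives a different variant, which is why the network rarely sees exactly the same input twice.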

Example

We also add Dropout before the Dense layer

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
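
For intuition, a dropout layer with rate 0.5 can be sketched in NumPy as follows. This uses the common "inverted dropout" formulation, in which kept values are scaled up during training so that nothing needs to change at test time; the function name is illustrative.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Randomly zero a fraction `rate` of the inputs during training."""
    if not training:
        return x                            # No dropout at prediction time
    keep = rng.random(x.shape) >= rate      # Mask of units to keep
    return x * keep / (1.0 - rate)          # Rescale to preserve the expected value

rng = np.random.default_rng(0)
x = np.ones((4, 6272))                      # Same width as the flattened feature map
train_out = dropout(x, 0.5, training=True, rng=rng)
test_out = dropout(x, 0.5, training=False, rng=rng)
print(train_out.mean(), test_out.mean())
```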

(Almost) no more overfitting!

Interpreting the model¶

  • Let's see what the convnet is learning exactly by observing the intermediate feature maps
    • A layer's output is also called its activation
  • We can choose a specific test image, and observe the outputs
  • We can retrieve and visualize the activation for every filter for every layer
  • Layer 0: has activations of resolution 148x148 for each of its 32 filters
  • Layer 2: has activations of resolution 72x72 for each of its 64 filters
  • Layer 4: has activations of resolution 34x34 for each of its 128 filters
  • Layer 6: has activations of resolution 15x15 for each of its 128 filters
Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_10 (Conv2D)          (None, 148, 148, 32)      896       
                                                                 
 max_pooling2d_8 (MaxPooling  (None, 74, 74, 32)       0         
 2D)                                                             
                                                                 
 conv2d_11 (Conv2D)          (None, 72, 72, 64)        18496     
                                                                 
 max_pooling2d_9 (MaxPooling  (None, 36, 36, 64)       0         
 2D)                                                             
                                                                 
 conv2d_12 (Conv2D)          (None, 34, 34, 128)       73856     
                                                                 
 max_pooling2d_10 (MaxPoolin  (None, 17, 17, 128)      0         
 g2D)                                                            
                                                                 
 conv2d_13 (Conv2D)          (None, 15, 15, 128)       147584    
                                                                 
 max_pooling2d_11 (MaxPoolin  (None, 7, 7, 128)        0         
 g2D)                                                            
                                                                 
 flatten_2 (Flatten)         (None, 6272)              0         
                                                                 
 dropout (Dropout)           (None, 6272)              0         
                                                                 
 dense_4 (Dense)             (None, 512)               3211776   
                                                                 
 dense_5 (Dense)             (None, 1)                 513       
                                                                 
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
_________________________________________________________________
  • To extract the activations, we create a new model that outputs the trained layers
    • 8 output layers in total (only the convolutional part)
  • We input a test image for prediction and then read the relevant outputs
layer_outputs = [layer.output for layer in model.layers[:8]]
activation_model = models.Model(inputs=model.input, outputs=layer_outputs)
activations = activation_model.predict(img_tensor)

Output of the first Conv2D layer, 3rd channel (filter):

  • Similar to a diagonal edge detector
  • Your own channels may look different

Output of filter 16:

  • Cat eye detector?

The same filter responds quite differently for other inputs

  • First 2 convolutional layers: various edge detectors
  • 3rd convolutional layer: increasingly abstract: ears, eyes
  • Last convolutional layer: more abstract patterns
  • Empty filter activations: input image does not have the information that the filter was interested in
  • Same layer, with dog image input
    • Very different activations

Spatial hierarchies¶

  • Deep convnets can learn spatial hierarchies of patterns
    • First layer can learn very local patterns (e.g. edges)
    • Second layer can learn specific combinations of patterns
    • Every layer can learn increasingly complex abstractions


Visualizing the learned filters¶

  • The filters themselves can be visualized by finding the input image that they are maximally responsive to
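  • Gradient ascent in input space: start from a random image, then repeatedly use the gradient of the loss (the filter's mean activation) to move the pixel values towards values that the filter responds to more strongly
from keras import backend as K
# layer_output, filter_index, size and step are assumed to be defined beforehand
# Note: K.gradients only works in graph mode (TensorFlow 1.x-style execution)
loss = K.mean(layer_output[:, :, :, filter_index])  # Mean activation of the chosen filter
grads = K.gradients(loss, model.input)[0]           # Gradient of the loss w.r.t. the input image
iterate = K.function([model.input], [loss, grads])  # Maps an image to (loss, gradient)
input_img_data = np.random.random((1, size, size, 3)) * 20 + 128.  # Random gray-ish image
for i in range(40):  # Run gradient ascent for 40 steps
    loss_v, grads_v = iterate([input_img_data])
    input_img_data += grads_v * step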
  • Learned filters of second convolutional layer
  • Mostly general, some respond to specific shapes/colors
  • Learned filters of last convolutional layer
  • More focused on center, some vague cat/dog head shapes
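
The idea of gradient ascent in input space can be shown on a toy case without a deep learning framework. The sketch below assumes a single, fixed 3x3 linear filter (a stand-in for a learned one), for which the gradient of the response with respect to the input is simply the kernel itself. Real convnets need automatic differentiation instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one learned filter (a horizontal-gradient detector)
kernel = np.array([[-1., 0., 1.],
                   [-2., 0., 2.],
                   [-1., 0., 1.]])

def response(patch):
    """Filter response: the 'loss' we want to maximize."""
    return np.sum(kernel * patch)

def response_gradient(patch):
    """Gradient of the response w.r.t. the input patch (constant for a linear filter)."""
    return kernel

patch = rng.random((3, 3)) * 0.1     # Start from a random input
start = response(patch)
step = 0.1
for _ in range(40):                  # 40 steps of gradient ascent
    grad = response_gradient(patch)
    grad = grad / (np.sqrt(np.mean(grad ** 2)) + 1e-5)  # Normalize the gradient
    patch += step * grad             # Move the input, not the weights
end = response(patch)
print(start, end)
```

The input drifts towards the pattern the filter is tuned to: dark on the left and bright on the right.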

Let's do this again for the VGG16 network pretrained on ImageNet (much larger)

model = VGG16(weights='imagenet', include_top=False)
Model: "vgg16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_1 (InputLayer)        [(None, None, None, 3)]   0         
                                                                 
 block1_conv1 (Conv2D)       (None, None, None, 64)    1792      
                                                                 
 block1_conv2 (Conv2D)       (None, None, None, 64)    36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, None, None, 64)    0         
                                                                 
 block2_conv1 (Conv2D)       (None, None, None, 128)   73856     
                                                                 
 block2_conv2 (Conv2D)       (None, None, None, 128)   147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, None, None, 128)   0         
                                                                 
 block3_conv1 (Conv2D)       (None, None, None, 256)   295168    
                                                                 
 block3_conv2 (Conv2D)       (None, None, None, 256)   590080    
                                                                 
 block3_conv3 (Conv2D)       (None, None, None, 256)   590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, None, None, 256)   0         
                                                                 
 block4_conv1 (Conv2D)       (None, None, None, 512)   1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, None, None, 512)   0         
                                                                 
 block5_conv1 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, None, None, 512)   2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, None, None, 512)   0         
                                                                 
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
  • Visualize convolution filters 0-2 from layer 5 of the VGG network trained on ImageNet
  • Some respond to dots or waves in the image

First 64 filters for 1st convolutional layer in block 1: simple edges and colors

Filters in 2nd block of convolution layers: simple textures (combined edges and colors)

Filters in 3rd block of convolution layers: more natural textures

Filters in 4th block of convolution layers: feathers, eyes, leaves,...

Visualizing class activation¶

  • We can also visualize which part of the input image had the greatest influence on the final classification
    • Helpful for interpreting what the model is paying attention to
  • Class activation maps: produce a heatmap over the input image
    • Take the output feature map of a convolution layer (e.g. the last one)
    • Weigh every filter by the gradient of the class with respect to the filter


Illustration (cats vs dogs)

  • These were the output feature maps of the last convolutional layer
    • These are flattened and fed to the dense layer
  • Compute gradient of the 'cat' node output wrt. every filter output (pixel) here
    • Average the gradients per filter, use that as the filter weight
  • Take the weighted sum of all filter maps to get the class activation map
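
These steps can be sketched in NumPy, assuming the feature maps and the gradients of the class score have already been extracted from the network (here they are synthetic). Clipping negative values and rescaling to [0, 1] are common conventions for display, added here as an assumption.

```python
import numpy as np

def class_activation_map(feature_maps, gradients):
    """Combine (h, w, c) feature maps into an (h, w) heatmap."""
    weights = gradients.mean(axis=(0, 1))           # One weight per filter
    cam = np.tensordot(feature_maps, weights, axes=([2], [0]))  # Weighted sum
    cam = np.maximum(cam, 0)                        # Keep positive evidence only
    if cam.max() > 0:
        cam = cam / cam.max()                       # Rescale to [0, 1]
    return cam

# Synthetic example: 3 filters on a 7x7 grid
feature_maps = np.zeros((7, 7, 3))
feature_maps[2, 3, 0] = 5.0      # Filter 0 fires at row 2, column 3
feature_maps[5, 1, 1] = 5.0      # Filter 1 fires elsewhere
gradients = np.zeros((7, 7, 3))
gradients[..., 0] = 1.0          # Filter 0 increases the class score
gradients[..., 1] = -1.0         # Filter 1 decreases it

heatmap = class_activation_map(feature_maps, gradients)
print(np.unravel_index(heatmap.argmax(), heatmap.shape))
```

The heatmap peaks where filters that support the class are active, and can then be resized and overlaid on the input image.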

More realistic example:

  • Try VGG (including the dense layers) and an image from ImageNet
model = VGG16(weights='imagenet')

Preprocessing

  • Load image
  • Resize to 224 x 224 (what VGG was trained on)
  • Do the same preprocessing (Keras VGG utility)
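import numpy as np
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
img_path = '../images/10_elephants.jpg'
img = image.load_img(img_path, target_size=(224, 224))  # Load and resize
x = image.img_to_array(img)                             # Array of shape (224, 224, 3)
x = np.expand_dims(x, axis=0) # Transform to batch of size (1, 224, 224, 3)
x = preprocess_input(x)                                 # Same preprocessing as used for training VGG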
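
For reference, the VGG16 variant of `preprocess_input` is understood to convert the image from RGB to BGR and subtract the per-channel ImageNet means, without rescaling. The NumPy sketch below mimics that behaviour under this assumption; it is an illustration, not a replacement for the Keras utility.

```python
import numpy as np

# Per-channel ImageNet means in BGR order (as used by the original VGG models)
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68])

def vgg_style_preprocess(x):
    """Convert an RGB batch (n, h, w, 3) to zero-centered BGR."""
    x = x[..., ::-1].astype("float64")   # RGB -> BGR
    return x - IMAGENET_BGR_MEAN         # Zero-center each channel

# A batch with one image whose every pixel equals the mean color
batch = np.zeros((1, 224, 224, 3))
batch[..., 0] = 123.68    # R
batch[..., 1] = 116.779   # G
batch[..., 2] = 103.939   # B
out = vgg_style_preprocess(batch)
print(out.shape, np.abs(out).max())
```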
  • Sanity test: do we get the right prediction?
preds = model.predict(x)
Predicted: [('n02504458', 'African_elephant', 0.90988594), ('n01871265', 'tusker', 0.085724816), ('n02504013', 'Indian_elephant', 0.0043471307)]

Visualize the class activation map


Superimposed on the original image

Using pretrained networks¶

  • We can re-use pretrained networks instead of training from scratch
  • Learned features can be a generic model of the visual world
  • Use the convolutional base to construct features, then train any classifier on new data
  • Also called transfer learning, which is a kind of meta-learning


  • Let's instantiate the VGG16 model (without the dense layers)
  • Final feature map has shape (4, 4, 512)
conv_base = VGG16(weights='imagenet', include_top=False, input_shape=(150, 150, 3))
Model: "vgg16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_2 (InputLayer)        [(None, 150, 150, 3)]     0         
                                                                 
 block1_conv1 (Conv2D)       (None, 150, 150, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 150, 150, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 75, 75, 64)        0         
                                                                 
 block2_conv1 (Conv2D)       (None, 75, 75, 128)       73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 75, 75, 128)       147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 37, 37, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 37, 37, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 37, 37, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 37, 37, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 18, 18, 256)       0         
                                                                 
 block4_conv1 (Conv2D)       (None, 18, 18, 512)       1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 18, 18, 512)       2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 18, 18, 512)       2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 9, 9, 512)         0         
                                                                 
 block5_conv1 (Conv2D)       (None, 9, 9, 512)         2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 9, 9, 512)         2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 9, 9, 512)         2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 4, 4, 512)         0         
                                                                 
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
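
The (4, 4, 512) shape follows directly from the summary: the VGG16 convolutions preserve the resolution, and each of the five pooling layers halves it (rounding down). A small check, with a hypothetical helper name:

```python
def vgg16_base_output_sizes(input_size, n_blocks=5):
    """Resolution after each of the pooling layers in the VGG16 base."""
    size = input_size
    sizes = []
    for _ in range(n_blocks):
        size = size // 2      # 2x2 max-pooling, odd sizes are rounded down
        sizes.append(size)
    return sizes

sizes_150 = vgg16_base_output_sizes(150)
sizes_224 = vgg16_base_output_sizes(224)
n_features = sizes_150[-1] * sizes_150[-1] * 512   # Size after flattening
print(sizes_150, sizes_224, n_features)
```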

Using pre-trained networks: 3 ways¶

  • Fast feature extraction (similar task, little data)
    • Call predict from the convolutional base to build new features
    • Use outputs as input to a new neural net (or other algorithm)
  • End-to-end tuning (similar task, lots of data + data augmentation)
    • Extend the convolutional base model with a new dense layer
    • Train it end to end on the new data (expensive!)
  • Fine-tuning (somewhat different task)
    • Unfreeze a few of the top convolutional layers, and retrain
      • Update only the more abstract representations

Fast feature extraction (without data augmentation)

  • Run every batch through the pre-trained convolutional base
generator = datagen.flow_from_directory(dir, target_size=(150, 150),
        batch_size=batch_size, class_mode='binary')
# The generator loops forever, so draw exactly one epoch of batches
for _ in range(len(generator)):
    inputs_batch, labels_batch = next(generator)
    features_batch = conv_base.predict(inputs_batch)
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
  • Build Dense neural net (with Dropout)
  • Train and evaluate with the transformed examples
model = models.Sequential()
model.add(layers.Dense(256, activation='relu', input_dim=4 * 4 * 512))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation='sigmoid'))
  • Validation accuracy around 90%, much better!
  • Still overfitting, despite the Dropout: not enough training data
Max val_acc 0.90500003
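
The feature-extraction loop in this section can be sketched end to end as follows. This is only an illustration of the bookkeeping (stopping the endless generator, storing the features, flattening them for the Dense net). To keep it runnable without images or a downloaded model, `fake_batches` and `fake_conv_base_predict` are hypothetical stand-ins for the Keras generator and `conv_base.predict`, and `extract_features` is an assumed helper name, not part of the lecture code.

```python
import numpy as np

def extract_features(batches, predict_fn, sample_count, feature_shape=(4, 4, 512)):
    """Run batches through predict_fn and collect flattened features and labels."""
    features = np.zeros((sample_count,) + feature_shape, dtype=np.float32)
    labels = np.zeros(sample_count, dtype=np.float32)
    start = 0
    for inputs_batch, labels_batch in batches:
        end = min(start + len(inputs_batch), sample_count)
        features[start:end] = predict_fn(inputs_batch)[: end - start]
        labels[start:end] = labels_batch[: end - start]
        start = end
        if start >= sample_count:  # the generator never stops on its own
            break
    # Flatten the (4, 4, 512) feature maps to vectors of length 8192
    return features.reshape(sample_count, -1), labels

# Hypothetical stand-in for flow_from_directory: yields batches forever
def fake_batches(batch_size=20):
    rng = np.random.default_rng(0)
    while True:
        yield rng.random((batch_size, 150, 150, 3)), rng.integers(0, 2, batch_size)

# Hypothetical stand-in for conv_base.predict: returns maps of the right shape
def fake_conv_base_predict(inputs_batch):
    return np.ones((len(inputs_batch), 4, 4, 512), dtype=np.float32)

X, y = extract_features(fake_batches(), fake_conv_base_predict, sample_count=50)
print(X.shape, y.shape)  # (50, 8192) (50,)
```

With the real generator and `conv_base.predict` plugged in, the flattened features match the `input_dim=4 * 4 * 512` expected by the Dense network.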

Fast feature extraction (with data augmentation)

  • Simply add the Dense layers to the convolutional base
  • Freeze the convolutional base (before you compile)
    • Without freezing, you train it end-to-end (expensive)
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
conv_base.trainable = False
Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 vgg16 (Functional)          (None, 4, 4, 512)         14714688  
                                                                 
 flatten (Flatten)           (None, 8192)              0         
                                                                 
 dense_4 (Dense)             (None, 256)               2097408   
                                                                 
 dense_5 (Dense)             (None, 1)                 257       
                                                                 
=================================================================
Total params: 16,812,353
Trainable params: 2,097,665
Non-trainable params: 14,714,688
_________________________________________________________________
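
The split between trainable and non-trainable parameters follows directly from freezing the base. A small arithmetic sketch, using only numbers from the summary (the variable names are chosen for illustration):

```python
conv_base_params = 14_714_688             # frozen VGG16 base, from the summary
flat_features = 4 * 4 * 512               # size of the Flatten output
dense_hidden = flat_features * 256 + 256  # weights + biases of Dense(256)
dense_out = 256 * 1 + 1                   # weights + bias of Dense(1)

trainable = dense_hidden + dense_out      # only the new head is trained
total = conv_base_params + trainable

print(flat_features, dense_hidden, dense_out)  # 8192 2097408 257
print(trainable, total)                        # 2097665 16812353
```

Only about 2.1 of the 16.8 million parameters are updated, which is what makes this variant much cheaper than end-to-end training.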
Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.
Epoch 1/30
100/100 [==============================] - 12s 110ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2800 - acc: 0.8850 - val_loss: 0.2386 - val_acc: 0.9040
Epoch 2/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2776 - acc: 0.8840 - val_loss: 0.2383 - val_acc: 0.9080
Epoch 3/30
100/100 [==============================] - 11s 110ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2863 - acc: 0.8730 - val_loss: 0.2553 - val_acc: 0.8970
Epoch 4/30
100/100 [==============================] - 11s 110ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2695 - acc: 0.8825 - val_loss: 0.2397 - val_acc: 0.9050
Epoch 5/30
100/100 [==============================] - 11s 108ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2765 - acc: 0.8775 - val_loss: 0.2451 - val_acc: 0.8960
Epoch 6/30
100/100 [==============================] - 11s 110ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2634 - acc: 0.8870 - val_loss: 0.2419 - val_acc: 0.9040
Epoch 7/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2605 - acc: 0.8860 - val_loss: 0.2398 - val_acc: 0.9060
Epoch 8/30
100/100 [==============================] - 11s 111ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2660 - acc: 0.8870 - val_loss: 0.2375 - val_acc: 0.9090
Epoch 9/30
100/100 [==============================] - 11s 110ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2757 - acc: 0.8795 - val_loss: 0.2374 - val_acc: 0.9010
Epoch 10/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2667 - acc: 0.8860 - val_loss: 0.2362 - val_acc: 0.9070
Epoch 11/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2549 - acc: 0.8930 - val_loss: 0.2427 - val_acc: 0.9020
Epoch 12/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2787 - acc: 0.8805 - val_loss: 0.2371 - val_acc: 0.9100
Epoch 13/30
100/100 [==============================] - 11s 110ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2690 - acc: 0.8800 - val_loss: 0.2494 - val_acc: 0.8950
Epoch 14/30
100/100 [==============================] - 11s 110ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2747 - acc: 0.8775 - val_loss: 0.2462 - val_acc: 0.8990
Epoch 15/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2476 - acc: 0.8910 - val_loss: 0.2418 - val_acc: 0.9010
Epoch 16/30
100/100 [==============================] - 11s 108ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2683 - acc: 0.8830 - val_loss: 0.2408 - val_acc: 0.9070
Epoch 17/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2803 - acc: 0.8770 - val_loss: 0.2443 - val_acc: 0.8980
Epoch 18/30
100/100 [==============================] - 11s 108ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2657 - acc: 0.8865 - val_loss: 0.2477 - val_acc: 0.9000
Epoch 19/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2664 - acc: 0.8895 - val_loss: 0.2384 - val_acc: 0.9040
Epoch 20/30
100/100 [==============================] - 11s 110ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2715 - acc: 0.8870 - val_loss: 0.2408 - val_acc: 0.9050
Epoch 21/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2589 - acc: 0.8860 - val_loss: 0.2359 - val_acc: 0.9090
Epoch 22/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2465 - acc: 0.9040 - val_loss: 0.2386 - val_acc: 0.9060
Epoch 23/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2557 - acc: 0.8905 - val_loss: 0.2363 - val_acc: 0.9130
Epoch 24/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2528 - acc: 0.8900 - val_loss: 0.2369 - val_acc: 0.9050
Epoch 25/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2584 - acc: 0.8880 - val_loss: 0.2393 - val_acc: 0.9110
Epoch 26/30
100/100 [==============================] - 11s 110ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2580 - acc: 0.8835 - val_loss: 0.2441 - val_acc: 0.9030
Epoch 27/30
100/100 [==============================] - 11s 108ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2549 - acc: 0.8875 - val_loss: 0.2381 - val_acc: 0.9090
Epoch 28/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2608 - acc: 0.8900 - val_loss: 0.2383 - val_acc: 0.9100
Epoch 29/30
100/100 [==============================] - 11s 110ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2544 - acc: 0.8900 - val_loss: 0.2561 - val_acc: 0.8940
Epoch 30/30
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2536 - acc: 0.8930 - val_loss: 0.2350 - val_acc: 0.9110

We again get about 90% validation accuracy, now with very little overfitting.

Max val_acc 0.906

Fine-tuning

  • Add your custom network on top of an already trained base network.
  • Freeze the base network, but unfreeze the last block of conv layers.
# Only the layers in block5 stay trainable; all earlier blocks are frozen
for layer in conv_base.layers:
    layer.trainable = layer.name.startswith('block5')

Visualized

Model: "vgg16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_2 (InputLayer)        [(None, 150, 150, 3)]     0         
                                                                 
 block1_conv1 (Conv2D)       (None, 150, 150, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 150, 150, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 75, 75, 64)        0         
                                                                 
 block2_conv1 (Conv2D)       (None, 75, 75, 128)       73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 75, 75, 128)       147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 37, 37, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 37, 37, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 37, 37, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 37, 37, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 18, 18, 256)       0         
                                                                 
 block4_conv1 (Conv2D)       (None, 18, 18, 512)       1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 18, 18, 512)       2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 18, 18, 512)       2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 9, 9, 512)         0         
                                                                 
 block5_conv1 (Conv2D)       (None, 9, 9, 512)         2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 9, 9, 512)         2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 9, 9, 512)         2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 4, 4, 512)         0         
                                                                 
=================================================================
Total params: 14,714,688
Trainable params: 7,079,424
Non-trainable params: 7,635,264
_________________________________________________________________
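
The trainable count in this summary corresponds to unfreezing all three convolutional layers of block 5. A minimal sketch that reproduces it from the layer names and parameter counts listed above (pooling and input layers are left out because they have no parameters):

```python
# (layer name, parameter count) pairs copied from the summary
vgg16_layers = [
    ('block1_conv1', 1792), ('block1_conv2', 36928),
    ('block2_conv1', 73856), ('block2_conv2', 147584),
    ('block3_conv1', 295168), ('block3_conv2', 590080), ('block3_conv3', 590080),
    ('block4_conv1', 1180160), ('block4_conv2', 2359808), ('block4_conv3', 2359808),
    ('block5_conv1', 2359808), ('block5_conv2', 2359808), ('block5_conv3', 2359808),
]

# Select layers by name prefix, as one would when setting layer.trainable
trainable = sum(n for name, n in vgg16_layers if name.startswith('block5'))
frozen = sum(n for name, n in vgg16_layers if not name.startswith('block5'))
print(trainable, frozen)  # 7079424 7635264

# For comparison: unfreezing only the first layer of the block
print(dict(vgg16_layers)['block5_conv1'])  # 2359808
```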
  • Load trained network, finetune
    • Use a small learning rate, large number of epochs
    • You don't want to unlearn too much: catastrophic forgetting
model = load_model(os.path.join(model_dir, 'cats_and_dogs_small_3b.h5'))
model.compile(loss='binary_crossentropy', 
              optimizer=optimizers.RMSprop(learning_rate=1e-5),
              metrics=['acc'])
history = model.fit(
      train_generator, steps_per_epoch=100, epochs=100,
      validation_data=validation_generator,
      validation_steps=50)
Epoch 1/10
100/100 [==============================] - 13s 120ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2738 - acc: 0.8830 - val_loss: 0.2365 - val_acc: 0.9070
Epoch 2/10
100/100 [==============================] - 11s 112ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2820 - acc: 0.8755 - val_loss: 0.2404 - val_acc: 0.9000
Epoch 3/10
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2769 - acc: 0.8810 - val_loss: 0.2392 - val_acc: 0.8990
Epoch 4/10
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2798 - acc: 0.8790 - val_loss: 0.2392 - val_acc: 0.9030
Epoch 5/10
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2794 - acc: 0.8725 - val_loss: 0.2378 - val_acc: 0.9060
Epoch 6/10
100/100 [==============================] - 11s 108ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2707 - acc: 0.8840 - val_loss: 0.2355 - val_acc: 0.9100
Epoch 7/10
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2826 - acc: 0.8790 - val_loss: 0.2347 - val_acc: 0.9100
Epoch 8/10
100/100 [==============================] - 11s 108ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2758 - acc: 0.8825 - val_loss: 0.2348 - val_acc: 0.9070
Epoch 9/10
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2732 - acc: 0.8845 - val_loss: 0.2364 - val_acc: 0.9060
Epoch 10/10
100/100 [==============================] - 11s 109ms/step - batch: 49.5000 - size: 20.0000 - loss: 0.2702 - acc: 0.8860 - val_loss: 0.2421 - val_acc: 0.8980

About 91% validation accuracy at best. The curves are quite noisy, though.

Max val_acc 0.91
  • We can smooth the learning curves using a running average
Max val_acc 0.9070000648498535
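
One common way to smooth a learning curve is an exponential moving average, in which each point is blended with the previous smoothed value. The lecture does not say which variant of running average it uses, so the function below is a sketch under that assumption, and the `factor=0.8` default is an arbitrary illustrative choice.

```python
def smooth_curve(points, factor=0.8):
    """Exponential moving average of a sequence of metric values."""
    smoothed = []
    for point in points:
        if smoothed:
            # Blend the previous smoothed value with the new point
            smoothed.append(smoothed[-1] * factor + point * (1 - factor))
        else:
            smoothed.append(point)  # the first point has no history
    return smoothed

# First five validation accuracies from the fine-tuning log
val_acc = [0.9070, 0.9000, 0.8990, 0.9030, 0.9060]
print([round(v, 4) for v in smooth_curve(val_acc)])
```

A larger `factor` gives a smoother but more delayed curve. Because smoothing damps the peaks, the maximum of the smoothed curve is usually slightly lower than that of the raw curve.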

Finally, evaluate the trained model on the test set. This is consistent with the validation results.

Found 1000 images belonging to 2 classes.
test acc: 0.89800006

Take-aways

  • Convnets are ideal for attacking visual-classification problems.
  • They learn a hierarchy of modular patterns and concepts to represent the visual world.
  • Representations are easy to inspect.
  • Data augmentation helps fight overfitting.
  • You can use a pretrained convnet to build better models via transfer learning.